Goto

Collaborating Authors

 shap dependence plot


Explanation of Machine Learning Models of Colon Cancer Using SHAP Considering Interaction Effects

arXiv.org Artificial Intelligence

When using machine learning techniques in decision-making processes, the interpretability of the models is important. Shapley additive explanation (SHAP) is one of the most promising interpretation methods for machine learning models. Interaction effects occur when the effect of one variable depends on the value of another variable. Even if each variable has little effect on the outcome, its combination can have an unexpectedly large impact on the outcome. Understanding interactions is important for understanding machine learning models; however, naive SHAP analysis cannot distinguish between the main effect and interaction effects. In this paper, we introduce the Shapley-Taylor index as an interpretation method for machine learning models using SHAP considering interaction effects. We apply the method to the cancer cohort data of Kyushu University Hospital (N=29,080) to analyze what combination of factors contributes to the risk of colon cancer.


SHAP for additively modeled features in a boosted trees model

arXiv.org Artificial Intelligence

An important technique to explore a black-box machine learning (ML) model is called SHAP (SHapley Additive exPlanation). SHAP values decompose predictions into contributions of the features in a fair way. We will show that for a boosted trees model with some or all features being additively modeled, the SHAP dependence plot of such a feature corresponds to its partial dependence plot up to a vertical shift. We illustrate the result with XGBoost.


Explanation of Machine Learning Models Using Shapley Additive Explanation and Application for Real Data in Hospital

arXiv.org Machine Learning

When using machine learning techniques in decision-making processes, the interpretability of the models is important. In the present paper, we adopted the Shapley additive explanation (SHAP), which is based on fair profit allocation among many stakeholders depending on their contribution, for interpreting a gradient-boosting decision tree model using hospital data. For better interpretability, we propose two novel techniques as follows: (1) a new metric of feature importance using SHAP and (2) a technique termed feature packing, which packs multiple similar features into one grouped feature to allow an easier understanding of the model without reconstruction of the model. We then compared the explanation results between the SHAP framework and existing methods. In addition, we showed how the A/G ratio works as an important prognostic factor for cerebral infarction using our hospital data and proposed techniques.


Consistent Individualized Feature Attribution for Tree Ensembles

arXiv.org Machine Learning

Interpreting predictions from tree ensemble methods such as gradient boosting machines and random forests is important, yet feature attribution for trees is often heuristic and not individualized for each prediction. Here we show that popular feature attribution methods are inconsistent, meaning they can lower a feature's assigned importance when the true impact of that feature actually increases. This is a fundamental problem that casts doubt on any comparison between features. To address it we turn to recent applications of game theory and develop fast exact tree solutions for SHAP (SHapley Additive exPlanation) values, which are the unique consistent and locally accurate attribution values. We then extend SHAP values to interaction effects and define SHAP interaction values. We propose a rich visualization of individualized feature attributions that improves over classic attribution summaries and partial dependence plots, and a unique "supervised" clustering (clustering based on feature attributions). We demonstrate better agreement with human intuition through a user study, exponential improvements in run time, improved clustering performance, and better identification of influential features. An implementation of our algorithm has also been merged into XGBoost and LightGBM, see http://github.com/slundberg/shap for details.